Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 3188 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 333 |
| Duplicate rows (%) | 10.4% |
| Total size in memory | 498.2 KiB |
| Average record size in memory | 160.0 B |
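The percentages and per-record figures in the table above are simple ratios of the other entries. As a minimal sketch (the variable names are illustrative and the figures are copied from the table), they can be re-derived like this:

```python
# Figures copied from the "Dataset statistics" table above.
n_observations = 3188
n_duplicate_rows = 333
total_size_kib = 498.2

# Share of duplicate rows, as a percentage rounded to one decimal place.
duplicate_pct = round(100 * n_duplicate_rows / n_observations, 1)

# Average record size: total bytes divided by the number of rows.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(f"Duplicate rows: {duplicate_pct}%")
print(f"Average record size: {avg_record_bytes} B")
```

Both results agree with the reported 10.4% and 160.0 B.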
Variable types
| NUM | 13 |
|---|---|
| CAT | 7 |
Reproduction
| Analysis started | 2020-08-25 01:56:34.824064 |
|---|---|
| Analysis finished | 2020-08-25 01:57:04.770883 |
| Duration | 29.95 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| Dataset has 333 (10.4%) duplicate rows | Duplicates |
| A0 has 747 (23.4%) zeros | Zeros |
| A36 has 814 (25.5%) zeros | Zeros |
| A45 has 757 (23.7%) zeros | Zeros |
| A13 has 803 (25.2%) zeros | Zeros |
| A54 has 799 (25.1%) zeros | Zeros |
| A33 has 1116 (35.0%) zeros | Zeros |
| A48 has 723 (22.7%) zeros | Zeros |
| A57 has 724 (22.7%) zeros | Zeros |
| A46 has 730 (22.9%) zeros | Zeros |
| A50 has 697 (21.9%) zeros | Zeros |
| A31 has 596 (18.7%) zeros | Zeros |
| A52 has 734 (23.0%) zeros | Zeros |
| A40 has 722 (22.6%) zeros | Zeros |
A0
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.0075282308657467 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 747 |
| Zeros (%) | 23.4% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 3 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.547878222 |
|---|---|
| Coefficient of variation (CV) | 0.7710368396 |
| Kurtosis | -1.609878221 |
| Mean | 2.007528231 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.01815575205 |
| Sum | 6400 |
| Variance | 2.395926992 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 3 | 875 | 27.4% | |
| 1 | 829 | 26.0% | |
| 0 | 747 | 23.4% | |
| 4 | 736 | 23.1% | |
| 2 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 747 | 23.4% | |
| 1 | 829 | 26.0% | |
| 2 | 1 | < 0.1% | |
| 3 | 875 | 27.4% | |
| 4 | 736 | 23.1% |
| Value | Count | Frequency (%) | |
| 4 | 736 | 23.1% | |
| 3 | 875 | 27.4% | |
| 2 | 1 | < 0.1% | |
| 1 | 829 | 26.0% | |
| 0 | 747 | 23.4% |
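The descriptive statistics for the numeric variable summarised directly above (its 747 zeros match the A0 warning) can be recomputed from its value counts alone. The sketch below assumes the sample variance uses the divisor n - 1, which reproduces the reported figure; the variable names are illustrative.

```python
import math

# Value counts copied from the frequency table above.
value_counts = {0: 747, 1: 829, 2: 1, 3: 875, 4: 736}

n = sum(value_counts.values())                        # number of observations
total = sum(v * c for v, c in value_counts.items())   # "Sum"
mean = total / n                                      # "Mean"

# Sample variance with divisor n - 1 (an assumption that matches the report).
variance = sum(c * (v - mean) ** 2 for v, c in value_counts.items()) / (n - 1)
std = math.sqrt(variance)                             # "Standard deviation"
cv = std / mean                                       # "Coefficient of variation (CV)"

print(n, total)
print(round(mean, 6), round(variance, 6), round(std, 6), round(cv, 6))
```

The results agree with the reported sum (6400), mean, variance, standard deviation and CV to the precision shown in the tables.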
A5
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 25.0 KiB |
| Value | Count | Frequency (%) | |
| 1 | 901 | 28.3% | |
| 2 | 821 | 25.8% | |
| 3 | 754 | 23.7% | |
| 0 | 712 | 22.3% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 1 | 901 | 28.3% | |
| 2 | 821 | 25.8% | |
| 3 | 754 | 23.7% | |
| 0 | 712 | 22.3% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 3188 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 901 | 28.3% | |
| 2 | 821 | 25.8% | |
| 3 | 754 | 23.7% | |
| 0 | 712 | 22.3% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 3188 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 1 | 901 | 28.3% | |
| 2 | 821 | 25.8% | |
| 3 | 754 | 23.7% | |
| 0 | 712 | 22.3% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 3188 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 1 | 901 | 28.3% | |
| 2 | 821 | 25.8% | |
| 3 | 754 | 23.7% | |
| 0 | 712 | 22.3% |
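Every value of this variable is a single ASCII digit, so the character-level tables above all repeat the value counts. The following sketch shows one way such a breakdown could be derived with Python's standard `unicodedata` module; it is not necessarily how the report computes it. The "Decimal Number" label corresponds to the Unicode general category `Nd`. Script and block lookups are not available in the standard library and are left out.

```python
import unicodedata
from collections import Counter

# Value counts for A5, copied from the frequency table above.
value_counts = {"1": 901, "2": 821, "3": 754, "0": 712}

char_counts = Counter()      # occurrences of each character
category_counts = Counter()  # occurrences per Unicode general category

for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count
        # unicodedata.category returns a two-letter code such as "Nd"
        # ("Number, decimal digit").
        category_counts[unicodedata.category(ch)] += count

# Set of distinct string lengths; a single entry means min = max = mean.
lengths = {len(value) for value in value_counts}

print(char_counts.most_common())
print(dict(category_counts))
print(lengths)
```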
A36
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.679422835633626 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 814 |
| Zeros (%) | 25.5% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.411145084 |
|---|---|
| Coefficient of variation (CV) | 0.840255982 |
| Kurtosis | -0.935443104 |
| Mean | 1.679422836 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.4829631137 |
| Sum | 5354 |
| Variance | 1.991330448 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 2 | 966 | 30.3% | |
| 0 | 814 | 25.5% | |
| 1 | 736 | 23.1% | |
| 4 | 670 | 21.0% | |
| 3 | 2 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 814 | 25.5% | |
| 1 | 736 | 23.1% | |
| 2 | 966 | 30.3% | |
| 3 | 2 | 0.1% | |
| 4 | 670 | 21.0% |
| Value | Count | Frequency (%) | |
| 4 | 670 | 21.0% | |
| 3 | 2 | 0.1% | |
| 2 | 966 | 30.3% | |
| 1 | 736 | 23.1% | |
| 0 | 814 | 25.5% |
A45
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.7057716436637391 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 757 |
| Zeros (%) | 23.7% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.42063481 |
|---|---|
| Coefficient of variation (CV) | 0.8328399733 |
| Kurtosis | -0.9753252616 |
| Mean | 1.705771644 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.4963578294 |
| Sum | 5438 |
| Variance | 2.018203264 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 2 | 878 | 27.5% | |
| 1 | 843 | 26.4% | |
| 0 | 757 | 23.7% | |
| 4 | 709 | 22.2% | |
| 3 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 757 | 23.7% | |
| 1 | 843 | 26.4% | |
| 2 | 878 | 27.5% | |
| 3 | 1 | < 0.1% | |
| 4 | 709 | 22.2% |
| Value | Count | Frequency (%) | |
| 4 | 709 | 22.2% | |
| 3 | 1 | < 0.1% | |
| 2 | 878 | 27.5% | |
| 1 | 843 | 26.4% | |
| 0 | 757 | 23.7% |
A13
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.6994981179422837 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 803 |
| Zeros (%) | 25.2% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.472700986 |
|---|---|
| Coefficient of variation (CV) | 0.8665505249 |
| Kurtosis | -1.094942119 |
| Mean | 1.699498118 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.5085788896 |
| Sum | 5418 |
| Variance | 2.168848195 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 1 | 898 | 28.2% | |
| 0 | 803 | 25.2% | |
| 4 | 772 | 24.2% | |
| 2 | 713 | 22.4% | |
| 3 | 2 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 803 | 25.2% | |
| 1 | 898 | 28.2% | |
| 2 | 713 | 22.4% | |
| 3 | 2 | 0.1% | |
| 4 | 772 | 24.2% |
| Value | Count | Frequency (%) | |
| 4 | 772 | 24.2% | |
| 3 | 2 | 0.1% | |
| 2 | 713 | 22.4% | |
| 1 | 898 | 28.2% | |
| 0 | 803 | 25.2% |
A54
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.6828732747804267 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 799 |
| Zeros (%) | 25.1% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.431018701 |
|---|---|
| Coefficient of variation (CV) | 0.8503425196 |
| Kurtosis | -0.9824211661 |
| Mean | 1.682873275 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.5080343702 |
| Sum | 5365 |
| Variance | 2.047814522 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 2 | 856 | 26.9% | |
| 1 | 826 | 25.9% | |
| 0 | 799 | 25.1% | |
| 4 | 706 | 22.1% | |
| 3 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 799 | 25.1% | |
| 1 | 826 | 25.9% | |
| 2 | 856 | 26.9% | |
| 3 | 1 | < 0.1% | |
| 4 | 706 | 22.1% |
| Value | Count | Frequency (%) | |
| 4 | 706 | 22.1% | |
| 3 | 1 | < 0.1% | |
| 2 | 856 | 26.9% | |
| 1 | 826 | 25.9% | |
| 0 | 799 | 25.1% |
A33
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.4523212045169385 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 1116 |
| Zeros (%) | 35.0% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.442959257 |
|---|---|
| Coefficient of variation (CV) | 0.9935538034 |
| Kurtosis | -0.7962542722 |
| Mean | 1.452321205 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.6893532309 |
| Sum | 4630 |
| Variance | 2.082131416 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 1116 | 35.0% | |
| 2 | 750 | 23.5% | |
| 1 | 719 | 22.6% | |
| 4 | 602 | 18.9% | |
| 3 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 1116 | 35.0% | |
| 1 | 719 | 22.6% | |
| 2 | 750 | 23.5% | |
| 3 | 1 | < 0.1% | |
| 4 | 602 | 18.9% |
| Value | Count | Frequency (%) | |
| 4 | 602 | 18.9% | |
| 3 | 1 | < 0.1% | |
| 2 | 750 | 23.5% | |
| 1 | 719 | 22.6% | |
| 0 | 1116 | 35.0% |
A48
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.6941656210790463 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 723 |
| Zeros (%) | 22.7% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.384708669 |
|---|---|
| Coefficient of variation (CV) | 0.8173396107 |
| Kurtosis | -0.8702551444 |
| Mean | 1.694165621 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.5166374184 |
| Sum | 5401 |
| Variance | 1.917418099 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 2 | 940 | 29.5% | |
| 1 | 859 | 26.9% | |
| 0 | 723 | 22.7% | |
| 4 | 664 | 20.8% | |
| 3 | 2 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 723 | 22.7% | |
| 1 | 859 | 26.9% | |
| 2 | 940 | 29.5% | |
| 3 | 2 | 0.1% | |
| 4 | 664 | 20.8% |
| Value | Count | Frequency (%) | |
| 4 | 664 | 20.8% | |
| 3 | 2 | 0.1% | |
| 2 | 940 | 29.5% | |
| 1 | 859 | 26.9% | |
| 0 | 723 | 22.7% |
A12
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 25.0 KiB |
| Value | Count | Frequency (%) | |
| 1 | 876 | 27.5% | |
| 3 | 826 | 25.9% | |
| 2 | 759 | 23.8% | |
| 0 | 727 | 22.8% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 1 | 876 | 27.5% | |
| 3 | 826 | 25.9% | |
| 2 | 759 | 23.8% | |
| 0 | 727 | 22.8% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 3188 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 876 | 27.5% | |
| 3 | 826 | 25.9% | |
| 2 | 759 | 23.8% | |
| 0 | 727 | 22.8% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 3188 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 1 | 876 | 27.5% | |
| 3 | 826 | 25.9% | |
| 2 | 759 | 23.8% | |
| 0 | 727 | 22.8% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 3188 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 1 | 876 | 27.5% | |
| 3 | 826 | 25.9% | |
| 2 | 759 | 23.8% | |
| 0 | 727 | 22.8% |
A57
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.7368255959849435 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 724 |
| Zeros (%) | 22.7% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.429681843 |
|---|---|
| Coefficient of variation (CV) | 0.8231579762 |
| Kurtosis | -1.024474791 |
| Mean | 1.736825596 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.4796426465 |
| Sum | 5537 |
| Variance | 2.043990171 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 1 | 874 | 27.4% | |
| 2 | 848 | 26.6% | |
| 4 | 741 | 23.2% | |
| 0 | 724 | 22.7% | |
| 3 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 724 | 22.7% | |
| 1 | 874 | 27.4% | |
| 2 | 848 | 26.6% | |
| 3 | 1 | < 0.1% | |
| 4 | 741 | 23.2% |
| Value | Count | Frequency (%) | |
| 4 | 741 | 23.2% | |
| 3 | 1 | < 0.1% | |
| 2 | 848 | 26.6% | |
| 1 | 874 | 27.4% | |
| 0 | 724 | 22.7% |
A46
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.7854454203262233 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 730 |
| Zeros (%) | 22.9% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.440509953 |
|---|---|
| Coefficient of variation (CV) | 0.8068070505 |
| Kurtosis | -1.089383931 |
| Mean | 1.78544542 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.4031516818 |
| Sum | 5692 |
| Variance | 2.075068926 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 2 | 925 | 29.0% | |
| 4 | 769 | 24.1% | |
| 1 | 763 | 23.9% | |
| 0 | 730 | 22.9% | |
| 3 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 730 | 22.9% | |
| 1 | 763 | 23.9% | |
| 2 | 925 | 29.0% | |
| 3 | 1 | < 0.1% | |
| 4 | 769 | 24.1% |
| Value | Count | Frequency (%) | |
| 4 | 769 | 24.1% | |
| 3 | 1 | < 0.1% | |
| 2 | 925 | 29.0% | |
| 1 | 763 | 23.9% | |
| 0 | 730 | 22.9% |
A50
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.730865746549561 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 697 |
| Zeros (%) | 21.9% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.402749867 |
|---|---|
| Coefficient of variation (CV) | 0.8104325073 |
| Kurtosis | -0.9498822309 |
| Mean | 1.730865747 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.4929266357 |
| Sum | 5518 |
| Variance | 1.967707189 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 2 | 901 | 28.3% | |
| 1 | 881 | 27.6% | |
| 4 | 708 | 22.2% | |
| 0 | 697 | 21.9% | |
| 3 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 697 | 21.9% | |
| 1 | 881 | 27.6% | |
| 2 | 901 | 28.3% | |
| 3 | 1 | < 0.1% | |
| 4 | 708 | 22.2% |
| Value | Count | Frequency (%) | |
| 4 | 708 | 22.2% | |
| 3 | 1 | < 0.1% | |
| 2 | 901 | 28.3% | |
| 1 | 881 | 27.6% | |
| 0 | 697 | 21.9% |
A31
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.342220828105395 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 596 |
| Zeros (%) | 18.7% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.61414724 |
|---|---|
| Coefficient of variation (CV) | 0.6891524578 |
| Kurtosis | -1.585411065 |
| Mean | 2.342220828 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.1914018206 |
| Sum | 7467 |
| Variance | 2.605471314 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 4 | 1429 | 44.8% | |
| 0 | 596 | 18.7% | |
| 2 | 586 | 18.4% | |
| 1 | 576 | 18.1% | |
| 3 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 596 | 18.7% | |
| 1 | 576 | 18.1% | |
| 2 | 586 | 18.4% | |
| 3 | 1 | < 0.1% | |
| 4 | 1429 | 44.8% |
| Value | Count | Frequency (%) | |
| 4 | 1429 | 44.8% | |
| 3 | 1 | < 0.1% | |
| 2 | 586 | 18.4% | |
| 1 | 576 | 18.1% | |
| 0 | 596 | 18.7% |
A3
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 25.0 KiB |
| Value | Count | Frequency (%) | |
| 1 | 876 | 27.5% | |
| 2 | 801 | 25.1% | |
| 0 | 757 | 23.7% | |
| 3 | 754 | 23.7% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 1 | 876 | 27.5% | |
| 2 | 801 | 25.1% | |
| 0 | 757 | 23.7% | |
| 3 | 754 | 23.7% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 3188 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 876 | 27.5% | |
| 2 | 801 | 25.1% | |
| 0 | 757 | 23.7% | |
| 3 | 754 | 23.7% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 3188 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 1 | 876 | 27.5% | |
| 2 | 801 | 25.1% | |
| 0 | 757 | 23.7% | |
| 3 | 754 | 23.7% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 3188 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 1 | 876 | 27.5% | |
| 2 | 801 | 25.1% | |
| 0 | 757 | 23.7% | |
| 3 | 754 | 23.7% |
A52
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.751254705144291 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 734 |
| Zeros (%) | 23.0% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.434015143 |
|---|---|
| Coefficient of variation (CV) | 0.8188501298 |
| Kurtosis | -1.047750744 |
| Mean | 1.751254705 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.4504453481 |
| Sum | 5583 |
| Variance | 2.056399429 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 2 | 883 | 27.7% | |
| 1 | 822 | 25.8% | |
| 4 | 748 | 23.5% | |
| 0 | 734 | 23.0% | |
| 3 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 734 | 23.0% | |
| 1 | 822 | 25.8% | |
| 2 | 883 | 27.7% | |
| 3 | 1 | < 0.1% | |
| 4 | 748 | 23.5% |
| Value | Count | Frequency (%) | |
| 4 | 748 | 23.5% | |
| 3 | 1 | < 0.1% | |
| 2 | 883 | 27.7% | |
| 1 | 822 | 25.8% | |
| 0 | 734 | 23.0% |
A17
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 25.0 KiB |
| Value | Count | Frequency (%) | |
| 3 | 915 | 28.7% | |
| 1 | 887 | 27.8% | |
| 2 | 736 | 23.1% | |
| 0 | 650 | 20.4% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 3 | 915 | 28.7% | |
| 1 | 887 | 27.8% | |
| 2 | 736 | 23.1% | |
| 0 | 650 | 20.4% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 3188 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 3 | 915 | 28.7% | |
| 1 | 887 | 27.8% | |
| 2 | 736 | 23.1% | |
| 0 | 650 | 20.4% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 3188 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 3 | 915 | 28.7% | |
| 1 | 887 | 27.8% | |
| 2 | 736 | 23.1% | |
| 0 | 650 | 20.4% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 3188 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 3 | 915 | 28.7% | |
| 1 | 887 | 27.8% | |
| 2 | 736 | 23.1% | |
| 0 | 650 | 20.4% |
A8
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 25.0 KiB |
| Value | Count | Frequency (%) | |
| 1 | 900 | 28.2% | |
| 3 | 797 | 25.0% | |
| 0 | 752 | 23.6% | |
| 2 | 739 | 23.2% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 1 | 900 | 28.2% | |
| 3 | 797 | 25.0% | |
| 0 | 752 | 23.6% | |
| 2 | 739 | 23.2% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 3188 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 900 | 28.2% | |
| 3 | 797 | 25.0% | |
| 0 | 752 | 23.6% | |
| 2 | 739 | 23.2% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 3188 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 1 | 900 | 28.2% | |
| 3 | 797 | 25.0% | |
| 0 | 752 | 23.6% | |
| 2 | 739 | 23.2% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 3188 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 1 | 900 | 28.2% | |
| 3 | 797 | 25.0% | |
| 0 | 752 | 23.6% | |
| 2 | 739 | 23.2% |
A6
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 25.0 KiB |
| Value | Count | Frequency (%) | |
| 1 | 860 | 27.0% | |
| 0 | 787 | 24.7% | |
| 3 | 782 | 24.5% | |
| 2 | 759 | 23.8% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 1 | 860 | 27.0% | |
| 0 | 787 | 24.7% | |
| 3 | 782 | 24.5% | |
| 2 | 759 | 23.8% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 3188 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 860 | 27.0% | |
| 0 | 787 | 24.7% | |
| 3 | 782 | 24.5% | |
| 2 | 759 | 23.8% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 3188 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 1 | 860 | 27.0% | |
| 0 | 787 | 24.7% | |
| 3 | 782 | 24.5% | |
| 2 | 759 | 23.8% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 3188 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 1 | 860 | 27.0% | |
| 0 | 787 | 24.7% | |
| 3 | 782 | 24.5% | |
| 2 | 759 | 23.8% |
A40
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.7754077791718945 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 722 |
| Zeros (%) | 22.6% |
| Memory size | 25.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.436796454 |
|---|---|
| Coefficient of variation (CV) | 0.8092768721 |
| Kurtosis | -1.073609599 |
| Mean | 1.775407779 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.4233100258 |
| Sum | 5660 |
| Variance | 2.064384051 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 2 | 903 | 28.3% | |
| 1 | 799 | 25.1% | |
| 4 | 763 | 23.9% | |
| 0 | 722 | 22.6% | |
| 3 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 722 | 22.6% | |
| 1 | 799 | 25.1% | |
| 2 | 903 | 28.3% | |
| 3 | 1 | < 0.1% | |
| 4 | 763 | 23.9% |
| Value | Count | Frequency (%) | |
| 4 | 763 | 23.9% | |
| 3 | 1 | < 0.1% | |
| 2 | 903 | 28.3% | |
| 1 | 799 | 25.1% | |
| 0 | 722 | 22.6% |
target
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 25.0 KiB |
| Value | Count | Frequency (%) | |
| 2 | 1655 | 51.9% | |
| 1 | 769 | 24.1% | |
| 0 | 764 | 24.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 2 | 1655 | 51.9% | |
| 1 | 769 | 24.1% | |
| 0 | 764 | 24.0% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 3188 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 2 | 1655 | 51.9% | |
| 1 | 769 | 24.1% | |
| 0 | 764 | 24.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 3188 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 2 | 1655 | 51.9% | |
| 1 | 769 | 24.1% | |
| 0 | 764 | 24.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 3188 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 2 | 1655 | 51.9% | |
| 1 | 769 | 24.1% | |
| 0 | 764 | 24.0% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not change r; only the direction of the line sets the sign.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
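The calculation described above can be sketched in plain Python. The function name `pearson_r` and the toy data are illustrative and are not part of the report or of pandas-profiling.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations.

    Undefined (division by zero) if either variable is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/(n - 1) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)

xs = [0, 1, 2, 3, 4]
ys = [2 * x + 1 for x in xs]              # perfectly linear in xs
rescaled = [10 * y - 3 for y in ys]       # change of location and scale
flipped = [-y for y in ys]                # direction reversed

print(pearson_r(xs, ys), pearson_r(xs, rescaled), pearson_r(xs, flipped))
```

Shifting and rescaling `ys` leaves r unchanged, while reversing the direction flips its sign.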
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
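As a rough illustration of the rank-based calculation, the sketch below ranks both variables and then applies the Pearson formula to the ranks. Giving tied values the average of their positions is a common convention and is an assumption here; all function names are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]   # monotonic in xs, but not linear

print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this monotonic but nonlinear relationship ρ reaches 1, whereas Pearson's r stays below 1.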
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
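The pair-counting definition above can be written out directly. This sketch implements the formula exactly as stated, which is often called τ-a; libraries commonly report a tie-adjusted variant (τ-b) instead, so figures for heavily tied data such as these columns may differ from this plain version. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied in x or y count as neither concordant nor discordant.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]           # one swapped pair: 5 concordant, 1 discordant

print(kendall_tau_a(xs, ys))
print(kendall_tau_a(xs, list(reversed(xs))))
```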
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | A0 | A5 | A36 | A45 | A13 | A54 | A33 | A48 | A12 | A57 | A46 | A50 | A31 | A3 | A52 | A17 | A8 | A6 | A40 | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2 | 1 | 2 | 0 | 4 | 2 | 4 | 2 | 4 | 4 | 0 | 4 | 1 | 4 | 3 | 1 | 1 | 2 | 0 |
| 1 | 3 | 2 | 1 | 4 | 4 | 4 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 3 | 2 | 3 | 1 | 1 | 4 | 1 |
| 2 | 0 | 2 | 2 | 4 | 1 | 4 | 1 | 1 | 1 | 2 | 4 | 4 | 4 | 2 | 4 | 2 | 2 | 3 | 0 | 2 |
| 3 | 4 | 1 | 2 | 0 | 0 | 2 | 2 | 2 | 2 | 2 | 1 | 4 | 1 | 1 | 1 | 1 | 1 | 3 | 1 | 2 |
| 4 | 0 | 0 | 1 | 0 | 0 | 4 | 0 | 2 | 0 | 4 | 2 | 1 | 0 | 3 | 1 | 0 | 0 | 2 | 2 | 2 |
| 5 | 1 | 1 | 4 | 2 | 1 | 1 | 0 | 0 | 1 | 0 | 2 | 4 | 1 | 1 | 1 | 2 | 1 | 3 | 2 | 1 |
| 6 | 0 | 0 | 4 | 4 | 1 | 0 | 4 | 0 | 0 | 4 | 1 | 1 | 0 | 2 | 1 | 3 | 3 | 0 | 0 | 2 |
| 7 | 1 | 2 | 4 | 1 | 1 | 0 | 2 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 2 | 1 | 1 | 2 | 4 | 2 |
| 8 | 4 | 1 | 0 | 2 | 1 | 0 | 4 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 3 | 3 | 3 | 4 | 2 |
| 9 | 1 | 1 | 0 | 0 | 2 | 2 | 0 | 1 | 0 | 4 | 4 | 0 | 2 | 1 | 4 | 1 | 1 | 3 | 4 | 2 |
Last rows
| | A0 | A5 | A36 | A45 | A13 | A54 | A33 | A48 | A12 | A57 | A46 | A50 | A31 | A3 | A52 | A17 | A8 | A6 | A40 | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3178 | 0 | 3 | 2 | 1 | 2 | 4 | 0 | 4 | 2 | 2 | 1 | 2 | 4 | 3 | 1 | 0 | 3 | 3 | 2 | 0 |
| 3179 | 1 | 2 | 1 | 4 | 1 | 4 | 2 | 2 | 2 | 2 | 2 | 4 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 2 |
| 3180 | 3 | 2 | 4 | 1 | 2 | 2 | 4 | 1 | 2 | 2 | 4 | 1 | 2 | 0 | 2 | 3 | 2 | 2 | 0 | 2 |
| 3181 | 3 | 1 | 0 | 4 | 2 | 2 | 1 | 2 | 2 | 1 | 2 | 2 | 4 | 2 | 1 | 3 | 2 | 2 | 2 | 0 |
| 3182 | 4 | 0 | 0 | 0 | 1 | 4 | 1 | 4 | 0 | 2 | 0 | 1 | 1 | 2 | 1 | 1 | 1 | 0 | 2 | 2 |
| 3183 | 4 | 2 | 2 | 1 | 2 | 4 | 4 | 0 | 0 | 4 | 0 | 1 | 2 | 2 | 1 | 3 | 3 | 1 | 2 | 2 |
| 3184 | 0 | 1 | 2 | 2 | 1 | 1 | 4 | 2 | 2 | 1 | 2 | 4 | 2 | 3 | 0 | 2 | 3 | 3 | 2 | 2 |
| 3185 | 4 | 1 | 2 | 2 | 0 | 1 | 0 | 1 | 2 | 2 | 0 | 1 | 4 | 0 | 2 | 0 | 2 | 0 | 2 | 0 |
| 3186 | 1 | 2 | 4 | 1 | 1 | 4 | 4 | 2 | 1 | 0 | 4 | 1 | 0 | 1 | 4 | 3 | 3 | 1 | 1 | 1 |
| 3187 | 4 | 1 | 4 | 2 | 1 | 4 | 4 | 1 | 1 | 1 | 4 | 4 | 4 | 1 | 4 | 2 | 3 | 1 | 0 | 0 |
Most frequent
| | A0 | A5 | A36 | A45 | A13 | A54 | A33 | A48 | A12 | A57 | A46 | A50 | A31 | A3 | A52 | A17 | A8 | A6 | A40 | target | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 152 | 3 | 1 | 2 | 0 | 1 | 0 | 1 | 1 | 3 | 2 | 2 | 1 | 2 | 1 | 2 | 1 | 1 | 3 | 0 | 1 | 8 |
| 140 | 3 | 0 | 4 | 1 | 2 | 2 | 2 | 2 | 0 | 2 | 0 | 1 | 4 | 0 | 0 | 0 | 1 | 3 | 2 | 0 | 7 |
| 65 | 1 | 0 | 2 | 1 | 1 | 1 | 1 | 0 | 3 | 2 | 1 | 2 | 1 | 3 | 0 | 3 | 1 | 0 | 2 | 1 | 6 |
| 105 | 1 | 2 | 2 | 2 | 4 | 0 | 0 | 2 | 0 | 1 | 2 | 1 | 4 | 2 | 1 | 0 | 0 | 2 | 1 | 0 | 6 |
| 1 | 0 | 0 | 0 | 1 | 1 | 4 | 2 | 2 | 3 | 0 | 1 | 1 | 0 | 3 | 0 | 1 | 2 | 0 | 4 | 1 | 5 |
| 63 | 1 | 0 | 1 | 2 | 1 | 0 | 0 | 1 | 3 | 2 | 2 | 2 | 0 | 3 | 2 | 3 | 1 | 2 | 4 | 1 | 5 |
| 89 | 1 | 1 | 4 | 1 | 4 | 0 | 2 | 2 | 1 | 1 | 0 | 0 | 4 | 1 | 2 | 1 | 1 | 0 | 2 | 1 | 5 |
| 122 | 1 | 3 | 2 | 1 | 4 | 2 | 1 | 2 | 1 | 4 | 4 | 2 | 4 | 1 | 1 | 3 | 3 | 3 | 0 | 1 | 5 |
| 170 | 3 | 2 | 2 | 0 | 0 | 2 | 0 | 1 | 3 | 0 | 4 | 4 | 4 | 1 | 4 | 2 | 1 | 1 | 1 | 0 | 5 |
| 172 | 3 | 2 | 2 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 2 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 4 | 1 | 5 |
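A table like the one above lists repeated rows with their occurrence counts. As a hedged sketch on hypothetical toy rows (not the actual dataset), such counts can be obtained by tallying identical rows. One common convention counts every repeat beyond the first occurrence as a duplicate. Whether the report's figure of 333 follows exactly this convention is an assumption.

```python
from collections import Counter

# Hypothetical toy rows; each tuple stands for one observation.
rows = [
    (3, 1, 2), (3, 1, 2), (3, 1, 2),
    (1, 0, 2), (1, 0, 2),
    (0, 4, 1),
]

row_counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, count) for row, count in row_counts.most_common() if count > 1]

# Count every repeat beyond the first occurrence of a row.
n_duplicate_rows = sum(count - 1 for _, count in duplicates)

for row, count in duplicates:
    print(row, count)
print(n_duplicate_rows)
```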